This file contains information on replicating the results in Kastellec, Jonathan, Andrew Gelman and Jamie Chandler. 2008. "Predicting and Dissecting the Seats-Votes Curve in the 2006 U.S. House Election." PS: Political Science & Politics 41(1):139-145.


Datasets

We used three datasets in the paper: a district-level dataset containing information on every race in each House election from 1946 to 2004; an aggregate-level dataset containing information on the total number of votes and seats gained by each party in the same elections; and a dataset containing information on each district that we used to make predictions for the 2006 election.


a) Individual House Races Data, 1946-2004

The dataset "jacobson_data.dta", which was given to us by Gary Jacobson, contains various information on every House race from 1946 to 2004, such as the vote share of the Democratic candidate and incumbency status; complete coding information is available in "Jacobson_coding.DOC". We modified and recoded these data using the Stata do-file "all_years_data_recoding.do". Coding information for the updated dataset, which we use for the analysis that appears in the paper, is available in "1946-2004_coding_updated.DOC".

b) Aggregate House Data, 1946-2004

The dataset "House_1946-2006_aggregate.dta", which was compiled based on data available from the Clerk of the House, contains aggregate information (in terms of seats and votes) for every House election from 1946 to 2004. Coding information is available in "1946-2004_coding_updated.DOC".

c) Individual House Race Data for Predicting 2006

The dataset "2006_house_data.dta" contains information about the 2006 election, including incumbency status and lagged vote leading up to the election, along with information about the winner and vote margins in the 2006 election. Data on 2004 vote shares and incumbency status were based on Jacobson’s data. Data on incumbency status and retirements were taken from various news sources in the months leading up to the election (see paper for references). The 2006 election results were supplied to us by Walt Borges, who gathered the official certified results of every state, which we then confirmed independently. Coding information is available in "2006_coding.DOC".

Statistical Code

All statistical analysis that appears in the paper was conducted using R. Complete, annotated code is in the script "house_2006_script_web.R".
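
Before running the full script, it can be useful to confirm that the Stata files named above load into R. The following is only a minimal sketch and is not taken from "house_2006_script_web.R". It assumes the three .dta files sit in the current working directory and that they can be read by read.dta() from the "foreign" package (which handles Stata versions 5 through 12). The helper load_if_present() and the list labels are hypothetical names introduced for illustration. The recoded district-level file produced by "all_years_data_recoding.do" is not named in this document, so it is not loaded here.

```r
# Illustrative sketch only; not part of the replication archive.
library(foreign)  # provides read.dta() for Stata .dta files

# File names as given in this document; the labels are hypothetical.
data_files <- c(
  district  = "jacobson_data.dta",
  aggregate = "House_1946-2006_aggregate.dta",
  house2006 = "2006_house_data.dta"
)

# Hypothetical helper: read a file if it exists, otherwise return NULL.
load_if_present <- function(path) {
  if (!file.exists(path)) {
    message("Not found, skipping: ", path)
    return(NULL)
  }
  read.dta(path)
}

# Named list with one entry per dataset (NULL for any missing file).
datasets <- lapply(data_files, load_if_present)

# Report the dimensions of each dataset that was loaded.
for (name in names(datasets)) {
  d <- datasets[[name]]
  if (!is.null(d)) {
    cat(name, ":", nrow(d), "rows,", ncol(d), "columns\n")
  }
}
```

Variable names and codings should be taken from the coding documents listed above rather than guessed from the loaded data frames.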

